Symbol detection in online handwritten graphics using Faster R-CNN
Symbol detection techniques in online handwritten graphics (e.g. diagrams and
mathematical expressions) consist of methods specifically designed for a single
graphic type. In this work, we evaluate the Faster R-CNN object detection
algorithm as a general method for detection of symbols in handwritten graphics.
We evaluate different configurations of the Faster R-CNN method, and point out
issues related to the handwritten nature of the data. Considering the online
recognition context, we evaluate efficiency and accuracy trade-offs of using
Deep Neural Networks of different complexities as feature extractors. We
evaluate the method on publicly available flowchart and mathematical expression
(CROHME-2016) datasets. Results show that Faster R-CNN can be effectively used
on both datasets, enabling the possibility of developing general methods for
symbol detection, and furthermore, general graphic understanding methods that
could be built on top of the algorithm.
Comment: Submitted to DAS-201
Deep Learning Multidimensional Projections
Dimensionality reduction methods, also known as projections, are frequently
used for exploring multidimensional data in machine learning, data science, and
information visualization. Among these, t-SNE and its variants have become very
popular for their ability to visually separate distinct data clusters. However,
such methods are computationally expensive for large datasets, suffer from
stability problems, and cannot directly handle out-of-sample data. We propose a
learning approach to construct such projections. We train a deep neural network
based on a collection of samples from a given data universe, and their
corresponding projections, and next use the network to infer projections of
data from the same, or similar, universes. Our approach generates projections
with characteristics similar to those of the learned ones, is computationally two to
three orders of magnitude faster than SNE-class methods, has no complex-to-set
user parameters, handles out-of-sample data in a stable manner, and can be used
to learn any projection technique. We demonstrate our proposal on several
real-world high-dimensional datasets from machine learning
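
The idea described in this abstract (fit a network on samples and their precomputed projections, then use it to project unseen data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: PCA stands in for the projection technique being learned (the abstract targets t-SNE-like methods), the network is a single-hidden-layer regressor written in NumPy rather than a deep network, the data are synthetic, and the names pca_projection, train_projector, and infer are hypothetical.

```python
import numpy as np

def pca_projection(X, k=2):
    """Reference projection to learn (PCA here, an assumed stand-in for t-SNE)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return lambda Z: (Z - mean) @ vt[:k].T

def train_projector(X, Y, hidden=32, epochs=1000, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network that maps samples X to projections Y."""
    rng = np.random.default_rng(seed)
    d, k = X.shape[1], Y.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, k))
    b2 = np.zeros(k)
    n = X.shape[0]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden activations
        P = H @ W2 + b2                    # predicted 2-D coordinates
        G = (P - Y) / n                    # gradient of the mean squared-error loss
        GH = (G @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH)
        b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def infer(params, X):
    """Project new samples with the trained network (a single forward pass)."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Synthetic 8-dimensional data: 400 training samples, 100 out-of-sample points.
rng = np.random.default_rng(42)
data = rng.normal(size=(500, 8))
train, unseen = data[:400], data[400:]

project = pca_projection(train)                  # the projection to imitate
params = train_projector(train, project(train))  # learn it from (sample, projection) pairs
pred = infer(params, unseen)                     # project out-of-sample data

# Error relative to the variance of the reference projection of the unseen points.
rel_err = np.mean((pred - project(unseen)) ** 2) / np.var(project(unseen))
print(pred.shape, rel_err)
```

Once trained, projecting a new point costs only one forward pass, which is where the speed and out-of-sample handling claimed in the abstract would come from. With a non-parametric reference such as t-SNE, the reference projection would be available only for the training samples, so held-out error could not be measured the way it is in this sketch.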
Bioinformatics and Biomedical Informatics
Published version